ceRNA network construction and identification of hub genes as novel therapeutic targets for age-related cataracts using bioinformatics

Background The aim of this study is to investigate the genetic and epigenetic mechanisms involved in the pathogenesis of age-related cataract (ARC). Methods We obtained the transcriptome datafile of th ree ARC samples and three healthy, age-matched samples and used differential expression analyses to identify the differentially expressed genes (DEGs). The differential lncRNA-associated competing endogenous (ceRNA) network, and the protein-protein network (PPI) were constructed using Cytoscape and STRING. Cluster analyses were performed to identify the underlying molecular mechanisms of the hub genes affecting ARC progression. To verify the immune status of the ARC patients, immune-associated analyses were also conducted. Results The PPI network identified the FOXO1 gene as the hub gene with the highest score, as calculated by the Maximal Clique Centrality (MCC) algorithm. The ceRNA network identified lncRNAs H19, XIST, TTTY14, and MEG3 and hub genes FOXO1, NOTCH3, CDK6, SPRY2, and CA2 as playing key roles in regulating the pathogenesis of ARC. Additionally, the identified hub genes showed no significant correlation with an immune response but were highly correlated with cell metabolism, including cysteine, methionine, and galactose. Discussion The findings of this study may provide clues toward ARC pathogenic mechanisms and may be of significance for future therapeutic research.


INTRODUCTION
As population age around the world trends upward, age-related cataracts (ARC) continue to be the primary cause of reversible severe vision impairment and blindness worldwide, with incidence increasing. The World Health Organization estimated that in 2014, there were 95 million people worldwide who were visually impaired due to cataracts (WHO, 2014), and a 2018 meta-analysis projected that the number of people affected by ARC would increase to 187.26 million (Song et al., 2018). The primary disease mechanisms of ARC formation are metabolic abnormalities and oxidative stress, mainly caused by ageing (Tang et al., 2015). There are also several risk factors for ARC formation including ultraviolet B exposure, some systematic disease factors (diabetes and hypertension), and some lifestyle factors such as smoking, drinking, and malnutrition (Liu et al., 2017). Currently, the only effective treatment method for ARC is the surgical removal of the onset lens and replacement of the intraocular lens. This procedure has a high success rate, but is also expensive. There are currently no effective therapeutic drugs for cataracts. With expensive surgery as the only treatment option available, cataracts are a significant medical and financial burden both at the individual and population levels. Further characterization of the pathogenesis of ARC is essential for developing new therapies.
In this study, we aimed to identify differentially expressed lncRNAs and the correlated hub genes to construct a differential lncRNA-associated competing endogenous (ceRNA) network and explore the underlying molecular mechanisms and therapeutic targets of ARC patients.

Differential Expression Genes (DEGs)
The lncRNA expression profiles were retrieved from the transcriptome sequencing data using the Gencode annotation file (https://www.gencodegenes.org/) from Perl software (version 30). The differentially expressed messenger RNA (mRNA) and lncRNA were obtained using the ''limma'' packages for R (version 4.2;R Core Team, 2022) and R studio (version 2022.02.2;R Core Team, 2022;RStudio Team, 2013) by comparing the normal and ARC groups with an adjusted p-value <0.05 after filtering the data with standard |log2 Fold Change|>1. The differently expressed genes were then used to explore the molecular mechanisms of ARC pathogenesis and development and the volcano map and correlation plot of the DEGs were drawn.

Functional enrichment analysis
In order to annotate specific genes and to identify the biological function and signaling pathways of the DEGs associated with the pathogenesis of ARC, the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were analyzed using the ''ClusterProfiler'' R package (based on p-value <0.05).

Immune infiltration-related analysis
The relative infiltration proportions of 22 immune cells in the ARC patients were estimated using the CIBERSORTx algorithm (https://cibersortx.stanford.edu/). The correlation and influence of these immune cell proportions were then analyzed using the ''corrplot'' R package and the relative percentage of the immune cells were compared by Wilcoxon test.
The relationship between the expression levels of the hub genes (mentioned below) and the number of immune cells was analyzed using the Spearman method with p < 0.05 considered significant.

The construction of the ceRNA network and PPI network
The potential microRNAs (miRNAs) that interact with the differential lncRNAs were predicted using the Perl software combined with the miRcode file (http://www.mircode.org/, the public transcriptome database). The miRDB (http://www.mirdb.org), mirtarbase (https://mirtarbase.cuhk.edu.cn/~miRTarBase/miRTarBase_2022/php/index.php), and TargetScan (http://www.targetscan.org) databases were used in combination to predict the target mRNA of the miRNA. Next, the intersection of miRNA-predicted-mRNAs and differential-mRNAs (co-identified target genes) was identified and used to construct the ceRNA and protein-protein interaction (PPI) network. Next, the five hub genes with the highest scores were identified using the Maximal Clique Centrality (MCC) algorithm (Chin et al., 2014); a topological analysis method introduced by Chin et al. that is good at predicting the proteins that form the yeast PPI network) from the Cytohubba plugin in Cytoscape (version 3.9.1).

Gene set enrichment analysis (GSEA) and construction of drug network on hub genes
The expression of each hub gene between the normal and ARC groups were compared and the correlations analyzed. The GSEA of each core gene was also performed to explore the biological signaling pathways (based on p-value <0.05). Additionally, protein-drug interaction data were retrieved from the DGIdb database (http://www.dgidb.org) and visualized using Cytoscape to predict potential therapeutic agents for ARC patients.

Differentially Expressed Genes (DEGs)
The process for identifying potential therapeutic targets for ARC is shown in Fig. 1. After annotating with the Gencode annotation file, 25,023 protein-coding RNAs and 143 lncRNAs were retrieved from the output expression files and used in the differential analysis. With the threshold of adjusted |log2(FC)|>1 and a p-value of 0.05, we identified a total of 187 differentially expressed mRNAs, including 56 up-regulated and 131 down-regulated genes. Additionally, four differentially expressed lncRNAs were identified, including two up-regulated-H19 and X-inactive specific transcript (XIST)-and two down-regulatedtestis-specific transcript Y-linked 14 (TTTY14) and MEG3. Figures 2A-2D shows the volcano and heatmap plots of the differentially expressed mRNAs (A, B) and lncRNAs (C, D).

Functional enrichment of DEGs
The GO and KEGG pathway enrichment analyses were conducted on the 187 differentially expressed mRNAs that were identified. The GO analysis identified 559 enriched pathways of significance in the biological processes (BP) category, 55 in the cellular components (CC) category, and 64 in the molecular functions (MF) category. The most significantly enriched terms in each of the categories were then identified (Fig. 3A). The most significantly enriched BP were: extracellular matrix organization, extracellular structure organization, and external encapsulating structure organization. The most significantly enriched CC were the collagen-containing extracellular matrix, endoplasmic reticulum lumen, and basement membrane. The most significantly enriched MF were: the extracellular matrix structural constituent, extracellular matrix binding, and peroxidase activity. The KEGG results identified 25 enriched pathways. The five psathways with the greatest enrichment were: human papillomavirus infection, breast cancer, the Wnt signaling pathway, glutamatergic synapse, and the TGF −beta signaling pathway (Fig. 3B).

Construction of ceRNA network and drug network on hub genes
The core ceRNA network was constructed with five hub genes, the nine miRNAs that interacted with the hub genes, and the four miRNA-targeted lncRNAs (shown in Fig. 8A). Figure 8B shows no significant correlation between immune cells and the five hub genes. A total of 26 drugs that interact with the identified ARC hub genes were acquired from the DGIdb database. Of the top five drugs identified with the greatest interaction score, one (Tarextumab) targets NOTCH3, and four (Dorzolamide, Brinzolamide, Ethinamate, and Sulthiame) target CA2. Four hub genes with drug interactions constitute the drug network visualized by Cytoscape shown in Fig. 9.

DISCUSSION
In this study, we analyzed the differentially expressed lncRNA in the anterior lens capsules of ARC patients and the control group using a differential expression analysis, functional  enrichment analysis, and an immune infiltration-related analysis. We then constructed a ceRNA network and identified five hub genes to serve as possible therapeutic targets for ARC progression. The GO enrichment of the DEGs in the BP were mainly enriched in extracellular matrix (ECM) organization. Because of the importance of the interactions between the ECM of the lens capsular cells during lens development, the disruption of the ECM or a change in cell signaling, especially transduced by integrins, may stunt lens growth and promote cataracts (Wederell & De Iongh, 2006). The KEGG pathway analysis of the DEGs identified the Wnt and TGF-beta signaling pathways, which is in accordance with the results of previous studies (Chen et al., 2021;Shi & Yang, 2021).
Our analysis showed no statistical differences in the percentage of immune cells between the normal and ARC groups. Our study used an estimated prediction of the abundance of different types of immune cells, and it is important to note that this finding was not experimentally verified and the data was not derived from a single-cell RNAseq experiment. Although a group of resident immune cells of the lens was found to be established during embryonic development and maintained in adulthood, the immune cells surveilling the lens are usually activated during eye injury, lens degeneration, and lens wounding, especially post cataract surgery (Stepp & Menko, 2021).
Competing endogenous RNA (ceRNA) has a complex regulatory network with rich biological functions, which has attracted extensive attention in the academic community. We constructed the ceRNA network using the differentially expressed lncRNA and the intersection of miRNA-predicted-mRNAs and differentially expressed mRNAs. Subsequently, the five hub genes with the highest scores, as calculated by the MCC method, were selected from the co-identified target mRNA of the PPI network. There was no significant correlation between the hub genes and the proportion of immune cells, which was consistent with the results of the immune analysis. Conversely, the GSEA demonstrated that the hub genes were highly correlated with cell metabolism, in line with previous research (Chen et al., 2014;Duncan & Wormstone, 1999;Stambolian, 1988;Truscott, 2005). Finally, the drug network was constructed to show the interaction between the identified proteins and possible drug therapies. There were four differentially expressed lncRNA identified in this study: H19 and XIST were up-regulated, and MEG3 and TTTY14 were down-regulated in the ARC patients. The H19 lncRNA was also found to be up-regulated in ARC patients in Cheng et al. (2019) and Liu et al. (2018), which is consistent with our study. These studies also found that the protective role of H19 lncRNA was associated with cell proliferation and apoptosis by regulating the miR-29a-TDG axis and miR-675-CRYAA axis. Fibrosis in the lens (EMT process, including the loss of epithelial cell integrity with abnormal proliferation, migration, and the cell morphology changing into more mesodermal-derived mesenchymal-like cells), increasing apoptosis resistance, and exaggerated ECM components production, is featured by the accumulation of excess connective tissue that destroys the normal structure and function of the lens (Lovicu, Shin & McAvoy, 2016).
In an in vitro study, Xiong et al. (2022) indicated the novel role of H19 lncRNA in inhibiting TGF-β2-induced EMT to prevent lens fibrosis. Previous studies have also demonstrated that Xist lncRNA can play a role in regulating X chromosome inactivation (XCI) and lead to the inheritable silencing of one of the X-chromosomes during female cell development. Xist lncRNA is also involved in tumor development and the progression of other diseases by acting as a ceRNA (Wang et al., 2021). Our study's functional enrichment results showed no significant enrichment in the X-chromosome, indicating that the ARC pathogenesis may be independent of the X-chromosome. Another study found that Xist lncRNA plays a protective role in diabetic cataracts by promoting cell proliferation and decreasing apoptosis through the miR-34a/SMAD2 (Wang, Zhao & Zhang, 2022). It is unclear in our study whether the Xist up-regulation that was observed was a causative or reactive protective factor in ARC. Another study showed that MEG3 lncRNA is a cataractogenesis molecule through the up-regulation of TP53INP1 in ARC patients (Tu et al., 2020). Though TTTY14 lncRNA has not been found in any cataract pathogenesis, previous studies have shown that it is associated with the progression of cancers (Gong et al., 2020;Kopczyńska et al., 2020;Li et al., 2017b), endometriosis (Bhat et al., 2019), and COVID-19 (Askari, Hadizadeh & Rashidifar, 2022, suggesting the critical role of TTTY14 in biological processes. Therefore, the molecular function of TTTY14 in ARC patients should be further studied. MicroRNAs (miRNAs) are a class of endogenous short non-coding RNAs (containing ∼22 nucleotides) that are part of the epigenome and post-transcriptional control gene expression through translational repression or mRNA degradation (Cai et al., 2009). The predicted target miRNAs of the hub genes associated the ceRNA network include miR-107, miR-129-5p, miR-135a-5p, miR-206, miR-23b-3p, miR-27a-3p, miR-3619-5p, miR-449c-5p, and miR-761. For example, the upregulated miR-23b-3p , which involved in apoptosis (Liu et al., 2022), autophagy (Zhou et al., 2019), and resistance to oxidative damage (Liang et al., 2020) in ARC via translational repression of key molecules, such as homeodomain interacting protein kinase 3 (HIPK3) and silent information regulator 1 (SIRT1). Importantly, the miRNAs were predicted by miRNA related database which need further experiment validation. FOXO1, one of the members of the FOXO subfamily of Forkhead transcription factors, is known as a cell response regulator to oxidative stress. Up-regulation of FOXO1 has been reported in high glucose-treated human lens epithelium cell (HLEC) lines contributing to cataractogenesis as a cell death-related gene and serving as the target of Lycium barbarum polysaccharide treatment (Yao et al., 2020). Conversely, another study (Zhu et al., 2018) with different glucose concentrations showed that up-regulating FOXO1 can prevent HLECs from oxidative damage induced by high glucose via beta-casomorphin-7 treatment. FOXO1 is also associated with choroidal neovascularization, and retina vein occlusions have also been reported (Chen et al., 2019;Zhou et al., 2021b). However, there is a lack of research on the role of up-regulated FOXO1 in ARC pathogenesis.
Notch proteins (NOTCH1-4) are a family of transmembrane receptors that play a vital role in both developmental and cell fate decisions (Artavanis-Tsakonas, Rand & Lake, 1999). The dysregulation of Notch proteins is involved in many diseases, such as cancer, cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy, and pulmonary hypertension (Hosseini-Alghaderi & Baron, 2020). Additionally, Zhou et al. (2021a) proposed that the down-regulation of NOTCH3/Hes1 was related to the apoptosis of lens epithelial cells under cold stimulation, which supports the decreased NOTCH3 expression observed in ARC patients in our study. However, another study suggested that NOTCH3 can be directedly regulated by the toll-like receptor (TLR)-3, contributing to the process of EMT, triggering fibrotic cataracts (Xie et al., 2022). Thus, further studies of the NOTCH3 inactivation mechanism in ARC patients are needed.
It is universally accepted that Sprouty proteins with four mammalian orthologs (SPRY1-4) function as antagonists of receptor tyrosine kinase-induced signal transduction in organisms, which is essential to growth and development (Cabrita & Christofori, 2008). The down-regulation of SPRY2 in patients with ARC has been observed in a number of studies. Previous studies (Lovicu, Shin & McAvoy, 2016;Shin et al., 2012) have shown that SPRY2 is a protective factor as a negative regulator of transforming growth factor β-induced EMT and cataract formation, likely by regulating ERK1/2 and Smad2 (Tan et al., 2016;Zhao et al., 2022). Studies have also shown that SPRY2 interacts with microRNA (Liu et al., 2019;Liu et al., 2021), suggesting that it may play a crucial role in the EMT of lens epithelial cells in ARC patients.
The down-regulation of CDK6 and CA2 were also identified in our study. CDK6, which is highly correlated with the cell cycle, constitutes a complex with cyclin D and CDK inhibitor to control the G1 checkpoint through the phosphorylation of the retinoblastoma protein (pRb; Lukas, Bartkova & Bartek, 1996). A study exploring the expression and activity of CDK cells during lens differentiation (developing rat lenses) showed that Cdk6 was not expressed in lens fiber cells or epithelial cells during lens differentiation (Gao et al., 1999), but there has been little study on the relationship between CDK6 and ARC.
There are 15 known human isoforms of carbonic anhydrase (CA) with different functions and distributions, which belong to a class of metalloenzymes that catalyze carbon dioxide into bicarbonate. Human CA variants have been linked to glaucoma, macular edema, ulcers, obesity, and cancer (Cabaleiro-Lago & Lundqvist, 2020). A study (Wistrand, 1999) investigating human lens CA demonstrated that the CA activity originates from CA 1, 2 and 3 in the cytoplasm , and from CA4 in the plasma membranes of lens epithelium and fibers in normal patients, and there was no CA activity observed in the supernatant of senile cataract lenses. As studies (Wistrand, 1999;Wistrand, 2000) have shown that the chronic intake of CA inhibitor does not seem to induce lens opacification (signs of cataract), we hypothesize that the down-regulation of CA2 observed in our study may be a protective response to oxidative stress.
Some limitations in this study should be recognized. First, our results require further validation through cell and animal experiments. Second, the samples are limited and lack clinical signatures (such as postoperative vision, surgery complication, and visual quality), which needs further study by enlarging the samples combined with clinical information.